Four SQL clauses let you run full-text searches against your full-text indexed tables:
CONTAINS: Specifies an exact match by default, with options that make the search more flexible.
CONTAINSTABLE: Returns a ranked rowset from SQL Server FTS implementing the Contains algorithm, which must be joined against the base table.
FREETEXT: Specifies a stemmed search that returns matches for all generations of the search phrase.
FREETEXTTABLE: Returns a ranked rowset from SQL Server FTS implementing the FreeText algorithm, which must be joined against the base table.
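The join that CONTAINSTABLE and FREETEXTTABLE require might look like the following sketch. It assumes, as the other examples in this section do, that Person.Contact has a full-text index, and additionally that ContactID is its full-text key column; adjust the names to match your own table. KEY and RANK are the columns that the rowset functions return.

```sql
-- Sketch: rank the matches for a phrase and join them back to the base table.
-- Assumes Person.Contact is full-text indexed with ContactID as the key column.
SELECT c.ContactID, c.FirstName, c.LastName, ft.[RANK]
FROM Person.Contact AS c
INNER JOIN CONTAINSTABLE(Person.Contact, *, '"search phrase"') AS ft
    ON c.ContactID = ft.[KEY]
ORDER BY ft.[RANK] DESC;  -- highest-ranked rows first
```

Swapping CONTAINSTABLE for FREETEXTTABLE in the same query should give the FreeText behavior with the same KEY and RANK columns.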
CONTAINS and CONTAINSTABLE
The CONTAINS and CONTAINSTABLE predicates have the following parameters:
Search Phrase
The search phrase is the phrase or word that you are looking for in a full-text indexed table. If you are searching for more than one word, you have to wrap your search phrase in double quotation marks, as in this example:
SELECT * FROM Person.Contact
WHERE CONTAINS(*, '"search phrase"') -- search all columns
This query searches all full-text indexed columns. You can also restrict the search to a single column or to a list of columns, as the following examples show:
SELECT * FROM Person.Contact
WHERE CONTAINS(FirstName, '"search phrase"') -- searching 1 column
SELECT * FROM Person.Contact
WHERE CONTAINS((FirstName, LastName), '"search phrase"') -- searching 2 columns
You can also use Boolean operators in your search phrase, as in this example:
SELECT * FROM Person.Contact
WHERE CONTAINS(*, '"Ford" AND NOT ("Harrison" OR "Betty")')
This example searches for rows that mention Ford (the cars, for instance) but excludes any row that also mentions Harrison or Betty, so references to Harrison Ford or Betty Ford are not returned.
CONTAINS supports Boolean AND, OR, and AND NOT but not OR NOT.
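Because the search condition is an ordinary string, it can also be supplied through a variable rather than a literal, which is convenient when the condition is assembled elsewhere. The following is a minimal sketch; the variable name @condition is illustrative, and Person.Contact is assumed to be full-text indexed as in the examples above. The second assignment uses the symbol forms of the operators (& for AND, | for OR, &! for AND NOT).

```sql
-- Sketch: pass the Boolean search condition in a variable.
DECLARE @condition nvarchar(200);
SET @condition = N'"Ford" AND NOT ("Harrison" OR "Betty")';

SELECT *
FROM Person.Contact
WHERE CONTAINS(*, @condition);

-- The same condition written with the symbol forms of the operators.
SET @condition = N'"Ford" &! ("Harrison" | "Betty")';

SELECT *
FROM Person.Contact
WHERE CONTAINS(*, @condition);
```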
You can also use wildcards in your searches by adding an asterisk (*) to the end of a word in your search phrase and wrapping the phrase in double quotation marks. A wildcard added to one word acts as a wildcard on every word in the search phrase, so a search on "Al Anon*" matches Alcoholics Anonymous, Al Anon, and Alexander Anonuevo.
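The Al Anon search just described might be written as follows. This is a sketch against the same assumed full-text index on Person.Contact. Note the double quotation marks inside the string: without them, full-text search does not treat the asterisk as a wildcard.

```sql
-- Sketch: prefix (wildcard) search on every word of the phrase.
SELECT *
FROM Person.Contact
WHERE CONTAINS(*, '"Al Anon*"');
```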
Generation
The term generation refers to all forms of a word, which could be the word itself, all declensions (that is, singular or plural forms, such as book and books), conjugations of a word (such as book, booked, booking, and books), and thesaurus replacements and substitutions of a word. To search on all generations of a word, you use either a FREETEXT search or the formsOf predicate inside a CONTAINS search. The following example shows how to use the formsOf predicate to search on declensions and conjugations of a word:
SELECT * FROM Person.Contact WHERE CONTAINS(*,'formsOf(inflectional,book)')
Generations of a word also include its thesaurus expansions and replacements. An expansion is the word and other synonyms of the word (for example, book and volume or car and automobile). An expansion can also include alternate spellings, abbreviations, and nicknames. A replacement is a word that you want replaced in a search. For example, if you have users searching on the word sex, and you want sex interpreted as gender, you can replace the search on the term sex with a search on the word gender.
To get the thesaurus option to work, you need to edit the thesaurus file for your language. By default, the thesaurus files are in C:\Program Files\Microsoft SQL Server\MSSQL.X\MSSQL\FTData, where X is the instance number. There is a thesaurus file for each full-text supported language; it is named TSXXX.XML, where XXX is a three-letter identifier for the language. There is also a thesaurus file called TSGlobal.XML. Changes made to the TSGlobal thesaurus file take effect in all languages but are overridden by the language-specific thesaurus files. To make a thesaurus file effective, you have to remove the comment marks around its contents and then restart MSFTESQL (the Microsoft SQL Server Full-Text Search service). Notice that the thesaurus files have an XML element called <diacritics = true/>. Setting this element to false makes the thesaurus insensitive to accents; otherwise, the thesaurus is accent sensitive.
As mentioned previously, the thesaurus file has two sections: an expansion section and a replacement section. The expansion section looks like this:
<expansion>
<sub>Internet Explorer</sub>
<sub>IE</sub>
<sub>IE5</sub>
<sub>IE6</sub>
</expansion>
The sub nodes refer to substitutes, so a search on Internet Explorer is expanded into searches on Internet Explorer, IE, IE5, and IE6.
The replacement section looks like this:
<replacement>
<pat>NT5</pat>
<pat>W2K</pat>
<sub>Windows 2000</sub>
</replacement>
Here, searches on the patterns NT5 or W2K are replaced by a search on Windows 2000, so your search will never find rows containing only the words NT5 or W2K.
To use the thesaurus option, you need to use the formsOf predicate. Here is an example of a formsOf query:
SELECT * FROM Person.Contact WHERE CONTAINS(*, 'formsof(thesaurus,ie)')
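For comparison, here is a sketch of the two ways to reach word generations, again assuming the full-text index on Person.Contact and an edited, active thesaurus file. FREETEXT applies inflectional forms and the thesaurus on its own, whereas CONTAINS applies only the generation terms you name explicitly.

```sql
-- Sketch: FREETEXT applies inflectional forms and the thesaurus automatically,
-- so no formsOf term is needed.
SELECT * FROM Person.Contact
WHERE FREETEXT(*, 'ie');

-- CONTAINS applies only what is requested; the two kinds of generation
-- can be combined with a Boolean operator.
SELECT * FROM Person.Contact
WHERE CONTAINS(*, 'FORMSOF(THESAURUS, ie) OR FORMSOF(INFLECTIONAL, book)');
```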